14 research outputs found

    Link Prediction via Community Detection in Bipartite Multi-Layer Graphs

    The growing number of multi-relational networks poses new challenges for solving classical graph problems, such as link prediction, in a multi-layer framework. In this work, we combine an existing bipartite local models method with approaches for link prediction from communities to address the link prediction problem in multi-layer graphs. To this end, we extend existing community detection-based link prediction measures to the bipartite multi-layer network setting. We obtain a new generic framework for link prediction in bipartite multi-layer graphs that can integrate any community detection approach, can handle an arbitrary number of networks, is rather inexpensive (depending on the community detection technique), and can tune its parameters automatically. We test our framework using two of the most common community detection methods, the Louvain algorithm and spectral partitioning, both of which can be easily applied to bipartite multi-layer graphs. We evaluate our approach on benchmark data sets for a common drug-target interaction prediction task in computational drug design and demonstrate experimentally that our approach is competitive with the state of the art.
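
The idea of predicting links from communities can be illustrated with a minimal sketch. It is not the measure used in the paper: the function name `community_link_scores` and the density-based score are assumptions made for illustration, and the community labels are taken as given (they could come from Louvain, spectral partitioning, or any other detector). Each unobserved drug-target pair is scored by the fraction of possible edges already present between the drug's community and the target's community.

```python
from collections import defaultdict


def community_link_scores(edges, community):
    """Score unobserved (drug, target) pairs by inter-community edge density.

    edges: iterable of observed (drug, target) pairs in one bipartite layer.
    community: dict mapping every node to a community label (assumed given).
    Hypothetical illustrative measure, not the one defined in the paper.
    """
    edges = set(edges)
    drugs = {d for d, _ in edges}
    targets = {t for _, t in edges}

    # Number of drugs and targets in each community.
    n_drugs = defaultdict(int)
    n_targets = defaultdict(int)
    for d in drugs:
        n_drugs[community[d]] += 1
    for t in targets:
        n_targets[community[t]] += 1

    # Observed edges between each (drug community, target community) pair.
    between = defaultdict(int)
    for d, t in edges:
        between[(community[d], community[t])] += 1

    scores = {}
    for d in drugs:
        for t in targets:
            if (d, t) in edges:
                continue  # only score pairs that are not yet linked
            cd, ct = community[d], community[t]
            scores[(d, t)] = between[(cd, ct)] / (n_drugs[cd] * n_targets[ct])
    return scores


# Toy data: d1, d2, t1, t2 share community 0; d3 and t3 form community 1.
edges = [("d1", "t1"), ("d1", "t2"), ("d2", "t1"), ("d3", "t3")]
community = {"d1": 0, "d2": 0, "t1": 0, "t2": 0, "d3": 1, "t3": 1}
scores = community_link_scores(edges, community)
print(scores[("d2", "t2")])  # 0.75: three of four possible edges exist inside community 0
```

In a multi-layer setting, one plausible extension would be to compute such scores per layer and combine them, but how the paper aggregates layers is not stated in the abstract.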

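    Prédiction de liens dans les réseaux bipartis multicouche, avec une application à la prédiction d’interaction médicament-cible thérapeutique (Link prediction in bipartite multi-layer networks, with an application to drug-target interaction prediction)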

    No full text
    Many real-life phenomena with a bi-relational structure can be modeled as bipartite networks. This modeling allows standard solutions to be used for predicting and/or recommending new relations between the objects in such networks. Known as the link prediction task, this is a widely studied problem in network science for single graphs, that is, networks with one type of interaction between vertices. For multi-layer networks, which allow more than one type of edge between vertices, the problem is not yet fully solved.
    The motivation of this thesis comes from the importance of an application task, drug-target interaction prediction. Searching for valid drug candidates for a given biological target is an essential part of modern drug development. In this thesis, the problem is modeled as link prediction in a bipartite multi-layer network. Modeling the problem in this setting helps to aggregate different sources of information into a single structure and, as a result, to improve the quality of link prediction.
    The thesis focuses mostly on the problem of link prediction in bipartite multi-layer networks and makes two main contributions on this topic. The first contribution is a solution for link prediction in this setting that does not limit the number and type of networks, which are the main constraints of state-of-the-art methods. Modeling a random walk in the fashion of PageRank, the algorithm that we developed is able to predict new interactions in the network constructed from different sources of information. The second contribution, which solves link prediction using community information, is less straightforward and more dependent on the choice of parameter values, but provides better results. By adapting existing community measures for link prediction to the case of bipartite multi-layer networks and proposing alternative ways of exploiting communities, the method offers better performance and efficiency. Additional evaluation on data of a different origin than drug-target interactions demonstrates the genericness of the proposed approach.
    In addition to the developed approaches, we propose a framework for validating predicted interactions that is founded on an external resource. Based on a collection of biomedical concepts used as a knowledge source, the framework validates drug-target pairs using the proposed confidence scores. An evaluation of predicted interactions performed on unseen data shows the effectiveness of this framework.
    Finally, the problem of identifying and characterizing promiscuous compounds, which arises in the drug development process, is discussed. The problem is solved as a machine learning classification task. The contribution includes graph mining and sampling approaches. In addition, a graphical interface was developed to give experts feedback on the results.
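
A random walk "in the fashion of PageRank" can be sketched as a random walk with restart on a single bipartite layer. This is a generic textbook formulation under stated assumptions, not the thesis's algorithm: the function name, the `restart` value of 0.15, and the fixed iteration count are illustrative choices, drug and target identifiers are assumed to be distinct strings, and `source` is assumed to appear in at least one edge.

```python
from collections import defaultdict


def random_walk_with_restart(edges, source, restart=0.15, iterations=100):
    """Approximate the stationary visiting probabilities of a walker that
    follows a random edge with probability 1 - restart and jumps back to
    `source` with probability `restart` (generic sketch, single layer)."""
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    nodes = list(neighbors)

    # Start with all probability mass on the source node.
    prob = {n: 0.0 for n in nodes}
    prob[source] = 1.0

    for _ in range(iterations):
        nxt = {n: 0.0 for n in nodes}
        for n in nodes:
            # Spread the non-restarting mass evenly over the neighbors of n.
            share = (1.0 - restart) * prob[n] / len(neighbors[n])
            for m in neighbors[n]:
                nxt[m] += share
        nxt[source] += restart  # restarting mass returns to the source
        prob = nxt
    return prob


# Toy layer: t2 is reachable from d1 through d2, while t3 is in another component.
edges = [("d1", "t1"), ("d2", "t1"), ("d2", "t2"), ("d3", "t3")]
prob = random_walk_with_restart(edges, source="d1")
ranking = sorted(["t2", "t3"], key=prob.get, reverse=True)
print(ranking)
```

Targets not yet linked to the source drug can then be ranked by their visiting probability; a node in a different connected component always receives probability zero, which is one reason connectivity matters for this family of methods.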

    What can connectivity characteristics of networks tell us about the quality of link predictions?

    No full text
    Link prediction in networks works better when those networks are connected and not sparse. But can we use common connectivity characteristics to decide when a network is well enough connected for a random walk process to predict links best? Recent results in our work on link prediction led us to ask this question, and we attempt to shed some light on it. We do so by combining networks stemming from different data sources into networks with different numbers of layers, and relating their connectivity characteristics to the AUC that a random walk algorithm for link prediction can achieve. We find that it seems to be very important to reduce the radius and diameter of the network as much as possible, and to get close to having a single connected component in the network. We also argue that the five benchmark data sets that have been used in the literature on drug-target activity prediction might be too easy to allow meaningful evaluations.
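
The connectivity characteristics named in this abstract (number of connected components, radius, diameter) can be computed with breadth-first search. The following is a minimal sketch under assumptions: the function name `connectivity_summary` is hypothetical, the graph is unweighted and given as a non-empty edge list, and radius and diameter are measured on the largest component only, which may differ from the convention used in the paper.

```python
from collections import defaultdict, deque


def connectivity_summary(edges):
    """Return the number of connected components, plus the radius and
    diameter of the largest component, of an undirected unweighted graph."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    def bfs(start):
        # Shortest-path distances (in hops) from start to every reachable node.
        dist = {start: 0}
        queue = deque([start])
        while queue:
            n = queue.popleft()
            for m in adj[n]:
                if m not in dist:
                    dist[m] = dist[n] + 1
                    queue.append(m)
        return dist

    seen = set()
    components = []
    for n in list(adj):
        if n not in seen:
            comp = set(bfs(n))
            seen |= comp
            components.append(comp)

    largest = max(components, key=len)
    # Eccentricity of a node: distance to the farthest node in its component.
    ecc = [max(bfs(n).values()) for n in largest]
    return {
        "components": len(components),
        "radius": min(ecc),
        "diameter": max(ecc),
    }


# A path a-b-c-d plus a separate edge x-y.
summary = connectivity_summary([("a", "b"), ("b", "c"), ("c", "d"), ("x", "y")])
print(summary)  # {'components': 2, 'radius': 2, 'diameter': 3}
```

Adding a layer that links `d` to `x` would merge the two components, which is the kind of change the abstract associates with better random walk predictions.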

    PrePeP: A light-weight, extensible tool for predicting frequent hitters

    No full text
    We present PrePeP, a lightweight tool for predicting whether molecules are frequent hitters and for visually inspecting the subgraphs that support this decision. PrePeP contains three modules: a mining component, an encoding/predicting component, and a graphical interface, all of which are easily extensible.

    LPbyCD: a new scalable and interpretable approach for Link Prediction via Community Detection in bipartite networks

    No full text
    Many real-life phenomena with a bi-relational structure can be modeled as bipartite networks. This modeling allows standard solutions to be used for predicting and/or recommending new relations between objects in such networks. In this work, we combine an existing bipartite local models method with approaches for link prediction from communities to address the link prediction problem in this type of network. The motivation of this work stems from the importance of an application task, drug–target interaction prediction. Searching for valid drug candidates for a given biological target is an essential part of modern drug development. We model the problem as link prediction in a bipartite multi-layer network, which helps to aggregate different sources of information into a single structure and, as a result, improves the quality of link prediction. We adapt existing community measures for link prediction to the case of bipartite multi-layer networks, propose alternative ways of exploiting communities, and show experimentally that our approach is competitive with the state of the art. We also demonstrate the scalability of our approach and assess its interpretability. Additional evaluations on data of a different origin than drug–target interactions demonstrate the genericness of the proposed approach.

    PrePeP – A Tool for the Identification and Characterization of Pan Assay Interference Compounds

    No full text
    Pan Assay Interference Compounds (PAINS) are a significant problem in modern drug discovery: compounds showing non-target-specific activity in high-throughput screening can mislead medicinal chemists during hit identification, wasting time and resources. Recent work has shown that existing structural alerts are not up to the task of identifying PAINS. To address this shortcoming, we are developing a tool, PrePeP, that predicts PAINS and allows experts to visually explore the reasons for the prediction. In the paper, we discuss the different aspects involved in developing a functional tool: systematically deriving structural descriptors, addressing the extreme imbalance of the data, and offering visual information that pharmacological chemists are familiar with. We evaluate the quality of the approach using benchmark data sets from the literature and show that we correct several shortcomings of existing PAINS alerts that have recently been pointed out.